Robust digit recognition using phase-dependent time-frequency masking

نویسندگان

  • Guangji Shi
  • Parham Aarabi
چکیده

A technique using the time-frequency phase information of two microphones is proposed to estimate an ideal timefrequency mask using time-delay-of-arrival (TDOA) of the signal of interest. At a signal-to-noise ratio (SNR) of 0dB, the proposed technique using two microphones achieves a digit recognition rate (average over 5 speakers, each speaking 20-30 digits) of 71%. In contrast, delayand-sum beamforming only achieves a 40% recognition rate with two microphones and 60% with four microphones. Superdirective beamforming achieves a 44% recognition rate with two microphones and 65% with four microphones.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust speech separation using time-frequency masking

A multi-microphone time-frequency speech masking technique is proposed. This technique utilizes both the timefrequency magnitude and phase information in order to estimate the Signal-to-Noise Ratio (SNR) maximizing masking coefficients for each time-frequency block given that the direction (or alternatively, the time-delay of arrival) of the speaker of interest is known. Using this masking algo...

متن کامل

Stereo-input speech recognition using sparseness-based time-frequency masking in a reverberant environment

We present noise robust automatic speech recognition (ASR) using sparseness-based underdetermined blind source separation (BSS) technique. As a representative underdetermined BSS method, we utilized time-frequency masking in this paper. Although time-frequency masking is able to separate target speech from interferences effectively, one should consider two problems. One is that masking does not...

متن کامل

Robust Fuzzy Gain-Scheduled Control of the 3-Phase IPMSM

This article presents a fuzzy robust Mixed - Sensitivity Gain - Scheduled H controller based on the Loop -Shaping methodology for a class of MIMO uncertain nonlinear Time - Varying systems. In order to design this controller, the nonlinear parameter - dependent plant is first modeled as a set of linear subsystems by Takagi and Sugeno’s (T - S) fuzzy approach. Both Loop - Shaping methodology and...

متن کامل

Forward masking on a generalized logarithmic scale for robust speech recognition

This paper examines the forward masking on the generalized logarithmic scale for robust speech recognition to both additive and convolutional noise. The forward masking in the dynamic cepstral (DyC) representation is based upon subtraction of a masking pattern from a current spectrum on a logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...

متن کامل

Noise Robust Speech Recognition Using Prosodic Information

This paper proposes a noise robust speech recognition method for Japanese utterances using prosodic information. In Japanese, the fundamental frequency (F0) contour conveys phrase intonation and word accent information. Consequently, it also conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using the Hough transform, whi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003